Tagging gene and protein names in full text articles
Abstract
Current information extraction efforts in the biomedical domain tend to focus on finding entities and facts in structured databases or MEDLINE abstracts. We apply a gene and protein name tagger trained on MEDLINE abstracts (ABGene) to a randomly selected set of full-text journal articles in the biomedical domain. We show the effect of adaptations made in response to the greater heterogeneity of full text.
Related articles
Identifying Experimental Techniques in Biomedical Literature
Named entity recognition of gene names, protein names, cell-lines, and other biologically relevant concepts has received significant attention by the research community. In this work, we considered named entity recognition of experimental techniques in biomedical articles. In our system to mine gene and disease associations, each association is categorized by the techniques used to derive the a...
Tagging gene and protein names in biomedical text
MOTIVATION: The MEDLINE database of biomedical abstracts contains scientific knowledge about thousands of interacting genes and proteins. Automated text processing can aid in the comprehension and synthesis of this valuable information. The fundamental task of identifying gene and protein names is a necessary first step towards making full use of the information encoded in biomedical text. This ...
Automatic extraction of gene and protein synonyms from MEDLINE and journal articles
Genes and proteins are often associated with multiple names, and more names are added as new functional or structural information is discovered. Because authors often alternate between these synonyms, information retrieval and extraction benefits from identifying these synonymous names. We have developed a method to extract automatically synonymous gene and protein names from MEDLINE and journa...
Evaluating gene/protein name tagging and mapping for article retrieval
Background: Tagging gene/protein names in text and mapping them to database entries are critical tasks in biological literature mining. Most of the existing tagging and normalization approaches, however, have not been evaluated for practical use in article retrieval towards efficient biocuration. Results: By utilizing literature cross-reference information provided by NCBI Entrez Gene database,...
Large-scale Extraction of Protein/Gene Relations for Model Organisms
This paper presents new results for extraction of relational information from biomedical text. We have improved our system to capture both new types of linguistic constructions and new types of biological information (i.e., (de-)phosphorylation). The precision of our system is stable with a slight increase in recall. From almost one million PubMed abstracts related to four model organisms,...